The latest results are presented of the search for new physics in inclusive dijet events recorded with the ATLAS
detector. The search for resonances in the dijet mass spectrum is updated with 0.81 fb$^{-1}$ of 2011 data. The
latest analysis of dijet angular distributions, with 36 pb$^{-1}$ of 2010 data, is also presented. In-depth information is
provided about the model-independent search for resonances. Limits are set on excited quarks, axigluons, and
scalar color octets, and on Gaussian signals, which allow approximate limits to be set in a model-independent way.



1. Introduction 


A search for new physics in events with at least two hadronic jets is motivated by numerous proposed extensions
of the Standard Model (SM), ranging from quark structure to extra spatial dimensions. Experimentally,
the large number of dijet events allows for a data-driven background estimation from very early on$^{1}$.
ATLAS currently pursues two complementary searches in dijet events:

1. The search for resonances in the mass ($m_{jj}$) spectrum of the two highest-$p_T$ (a.k.a. "leading") jets,

2. The search for an enhancement of leading jets produced at similar rapidities, i.e. at small $|\Delta y|$.

For brevity, the former analysis is called "Resonance Search", and the latter "Angular Search". Both are
sensitive to very similar new physics that would appear in both $m_{jj}$ and $\Delta y$.

Section 2 offers an overview of the Angular Search and its latest results with 36 pb$^{-1}$ of 2010 data. Section 3
contains the latest update of the Resonance Search, with 0.81 fb$^{-1}$ of 2011 data. In Section 3.1 the opportunity
is taken to offer in-depth information about the method used to search for anomalies in a model-independent
way, explaining hypertests and the BumpHunter [4]. Section 3.2 summarizes the limits set on specific models,
and Section 3.3 the model-independent limits.



2. Summary of 2010 results

This section summarizes the results of [5], where both dijet searches were presented with 36 pb$^{-1}$ of 2010
data. Since this has been the latest update of the Angular Search, more emphasis is given to it here,
deferring the Resonance Search for later.

The event selection, which is detailed in [5], ensures good data quality, well-measured jets, and constant
trigger efficiency. The basic observable of the Angular Search is

$$\chi \equiv e^{|\Delta y|}, \tag{1}$$

where $\Delta y$ is the rapidity$^{2}$ difference between the leading and the subleading jet, i.e. the jets with the highest
and second highest $p_T$ in each event. Fig. 1(a) shows the distribution of data in $\chi$, in 5 broad intervals (a.k.a.
"bins") of $m_{jj}$. The data are statistically compatible with the expectation, which is computed using Pythia
Monte Carlo (MC) to model QCD, and NLOJET++ for next-to-leading-order (NLO) corrections. A derived
observable is $F_\chi$, which is the fraction of events at $\chi < e^{1.2}$. The choice of $e^{1.2}$ is based on optimization of
the sensitivity to quark contact interactions. New physics would appear as an increase in $F_\chi$. When $F_\chi$ is
computed in bins of $m_{jj}$, the result is an $F_\chi(m_{jj})$ spectrum, which can indicate new physics by an increase of
$F_\chi(m_{jj})$ in $m_{jj}$ bins that contain significant amounts of signal. Fig. 1(b) shows the observed and expected
$F_\chi(m_{jj})$ spectra.
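As an illustration of Eq. (1) and of the $F_\chi$ definition, here is a minimal Python sketch. The function names, the representation of events as rapidity pairs, and the toy values are assumptions made for this example; they are not taken from the analysis code. The default cut corresponds to $|\Delta y| < 1.2$, and the fraction is taken over all events passed in, which is also an assumption of the sketch.

```python
import math


def chi(y1, y2):
    """Angular observable chi = exp(|y1 - y2|) for the two leading jets."""
    return math.exp(abs(y1 - y2))


def f_chi(events, chi_cut=math.exp(1.2)):
    """Fraction of events with chi below chi_cut.

    `events` is a list of (y_leading, y_subleading) pairs that already
    passed the event selection (hypothetical input format).
    """
    if not events:
        raise ValueError("need at least one event")
    n_central = sum(1 for y1, y2 in events if chi(y1, y2) < chi_cut)
    return n_central / len(events)


# Toy rapidities, invented for illustration only.
toy_events = [(0.5, 0.1), (1.0, -1.0), (0.0, 0.3)]
print(f_chi(toy_events))
```

Computing `f_chi` separately for the events falling in each $m_{jj}$ bin would give a toy version of the $F_\chi(m_{jj})$ spectrum.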

No significant discrepancy is observed. The p-values of goodness-of-fit tests, which use as test statistic the negative
logarithm of the likelihood ($L$) of the data in all bins assuming the background ($-\log L(\mathrm{data}|\mathrm{background})$),



$^{1}$ Indicatively, the dijet search was the first LHC search for exotic phenomena to reach beyond Tevatron limits, with only 0.3 pb$^{-1}$.
$^{2}$ Rapidity ($y$) follows the usual definition with respect to the beam axis ($z$): $y = \frac{1}{2}\ln\frac{E+p_z}{E-p_z}$.
(a) The observed (filled points) and expected (histograms) $\chi$ distribution.
(b) The observed (filled points) and expected (histograms) $F_\chi(m_{jj})$ distribution.

Figure 1: Distributions of $\chi$ (left) and $F_\chi(m_{jj})$ (right). The dashed histogram is an example of quark compositeness
signal MC.



are of order 30% or more. This level of agreement is observed in all $m_{jj}$ bins where $\chi$ is examined, and in the
$F_\chi(m_{jj})$ spectrum (Fig. 1).

Since agreement was found with the SM, limits were set on various new physics models. Three angular
observables were used: the full $F_\chi(m_{jj})$ spectrum, the $F_\chi$ of events with $m_{jj} > 2$ TeV, and the differential
distribution of events in $\chi$, i.e. $dN/d\chi$, in events with $m_{jj} > 2$ TeV. Table I summarizes all results, including those
of the Resonance Search, which have now been superseded by 2011 data, as will be shown. All Frequentist
limits in [5] apply the classical (CL$_{s+b}$) Neyman construction, with the test statistic being the logarithm of the
ratio of likelihoods of the hypotheses with and without signal. Systematic uncertainties are mainly from the
jet energy scale, the renormalization and factorization scales ($\mu_R$, $\mu_F$), and parton density functions (PDF).
These uncertainties are propagated to the limits by random sampling of the nuisance parameters, thus making
the Neyman band wider, which is known not to be fully consistent with the Frequentist framework, but is a
commonly accepted practice. Regarding in particular the limit on the contact interaction scale $\Lambda$ using $F_\chi(m_{jj})$,
the observed Frequentist limit of 9.5 TeV is much above the expected limit. However, this is not the case for the
Bayesian limit using the same $F_\chi(m_{jj})$, or for the Frequentist limits using the other two angular observables
(see the last 4 rows of Table I).



3. Dijet Resonance Search with 0.81 fb$^{-1}$ of 2011 data

This section is a summary of [11] that highlights some of the technical details of the analysis.

The main observable is $m_{jj}$, the mass of the system of the two leading jets in $p_T$. Events are selected to
ensure good beam and detector conditions, well-reconstructed leading jets, no potential for accidental swapping
of the order of jets in $p_T$, and constant trigger efficiency in $m_{jj}$. To suppress the QCD background at high
mass, $|\Delta y|$ is required to be less than 1.2, in accordance with the Angular Search (Section 2).

The $m_{jj}$ spectrum is compared to the expected one, searching for a local enhancement of the cross-section around
some value of $m_{jj}$, which is what "resonance" means in this context. The comparison is made by a hypertest
known as BumpHunter [4]. Bayesian lower limits are set, at 95% credibility level (C.L.), on the mass of excited
quarks ($q^*$), axigluons ($A$), and scalar color octets ($s8$). Finally, model-independent limits are given, which can be
used to approximate the limit on virtually any resonant signal decaying into two jets.

The expected spectrum is obtained by fitting to the data the function

$$f(x) = p_0 \frac{(1-x)^{p_1}}{x^{p_2 + p_3 \ln x}}, \quad \text{where } x \equiv \frac{m_{jj}}{\sqrt{s}}. \tag{2}$$
Table I: Lower limits, at 95% confidence level (or credibility level, for Bayesian limits) (C.L.), set by the Resonance and
Angular Searches in 36 pb$^{-1}$ of 2010 data [5]. Results of the Angular Search are distinguished by the observable listed
in the first column. The second column lists which limits are Frequentist and which Bayesian (with a constant prior in
signal cross-section for the Resonance Search, and in a different variable for the Angular Search).



Analysis / observable                   Method         95% C.L. Limits (TeV)
                                                       Expected    Observed

Excited Quark $q^*$
  Resonance in $m_{jj}$                 Bayesian       2.07        2.15
  $F_\chi(m_{jj})$                      Frequentist    2.12        2.64

Randall-Meade Quantum Black Hole for n = 6
  Resonance in $m_{jj}$                 Bayesian       3.64        3.67
  $F_\chi(m_{jj})$                      Frequentist    3.49        3.78
  $F_\chi$ for $m_{jj} > 2$ TeV         Frequentist    3.37        3.69
  $dN/d\chi$ for $m_{jj} > 2$ TeV       Frequentist    3.46        3.49

Axigluon
  Resonance in $m_{jj}$                 Bayesian       2.01        2.10

Contact Interaction $\Lambda$
  $F_\chi(m_{jj})$                      Frequentist    5.7         9.5
  $F_\chi(m_{jj})$                      Bayesian       5.7         6.7
  $F_\chi$ for $m_{jj} > 2$ TeV         Frequentist    5.2         6.8
  $dN/d\chi$ for $m_{jj} > 2$ TeV       Frequentist    5.4         6.7


This parametrization has been shown to fit well the QCD prediction given by Pythia, Herwig, Alpgen, and
NLOJET++ [10]. Figure 2(a) shows an example of fitting the Pythia QCD prediction, after full ATLAS
detector simulation, without NLO corrections.
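For readers who want to experiment with the background shape, the following Python sketch evaluates a four-parameter function of the form $f(x) = p_0 (1-x)^{p_1} / x^{p_2 + p_3 \ln x}$ with $x = m_{jj}/\sqrt{s}$. The function name and the parameter values are placeholders chosen for illustration, not the fitted values of the analysis; $\sqrt{s} = 7$ TeV is the LHC collision energy of the 2010 and 2011 runs.

```python
import math

SQRT_S_GEV = 7000.0  # pp centre-of-mass energy of the 2010-2011 LHC runs


def dijet_background(mjj_gev, p0, p1, p2, p3, sqrt_s_gev=SQRT_S_GEV):
    """f(x) = p0 * (1 - x)**p1 / x**(p2 + p3 * ln x), with x = mjj / sqrt(s)."""
    x = mjj_gev / sqrt_s_gev
    if not 0.0 < x < 1.0:
        raise ValueError("mjj must lie between 0 and sqrt(s)")
    return p0 * (1.0 - x) ** p1 / x ** (p2 + p3 * math.log(x))


# Placeholder parameters, NOT the fitted ATLAS values.
params = dict(p0=1.0, p1=10.0, p2=5.0, p3=0.0)
for mjj in (1000.0, 2000.0, 3000.0):
    print(mjj, dijet_background(mjj, **params))
```

With positive `p1` and `p2` the function falls steeply and smoothly with mass, which is the qualitative behaviour expected of the QCD dijet spectrum.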

The fit is performed in such a way that, if the data contain a non-negligible signal in any $m_{jj}$ interval, that
interval is omitted from the fit, thus obtaining the background from the sidebands. The algorithm that determines
whether this is necessary searches for any $m_{jj}$ window that may be responsible for a globally poor fit ($\chi^2$ test p-value
< 1%). No such interval was found in the data, so the whole spectrum was fitted, giving a p-value much
greater than the 1% threshold.

Figure 2(b) shows the data, along with the fitted background, and three examples of $q^*$ signal added to it.
The overall consistency of the data with the fit is quantified by the p-value of the negative logarithm of the
likelihood of the data when the background is expected ($-\log L(\mathrm{data}|\mathrm{background})$). This is a generalization of
the $\chi^2$ test that correctly accounts for the Poisson probability in bins with low background. That p-value is about
13%, indicating consistency.
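The goodness-of-fit test described here can be sketched in a few lines of Python. This is a toy under stated assumptions: the bins are treated as independent Poisson counts, the background is held fixed rather than refitted to each pseudo-spectrum, and the function names, the toy background, and the simple Poisson sampler are all invented for this example.

```python
import math
import random


def neg_log_likelihood(data, bkg):
    """-log L(data | background) for independent Poisson bins."""
    return -sum(d * math.log(b) - b - math.lgamma(d + 1)
                for d, b in zip(data, bkg))


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def gof_pvalue(data, bkg, n_pseudo=2000, seed=7):
    """Fraction of background-only pseudo-spectra at least as unlikely as the data."""
    rng = random.Random(seed)
    t_obs = neg_log_likelihood(data, bkg)
    n_worse = 0
    for _ in range(n_pseudo):
        pseudo = [sample_poisson(b, rng) for b in bkg]
        if neg_log_likelihood(pseudo, bkg) >= t_obs:
            n_worse += 1
    return n_worse / n_pseudo


toy_bkg = [20.0, 10.0, 5.0, 2.0]  # invented expectation per bin
print(gof_pvalue([22, 9, 6, 2], toy_bkg))   # data close to the expectation
print(gof_pvalue([60, 10, 5, 2], toy_bkg))  # large excess in the first bin
```

Because the p-value is obtained from pseudo-experiments rather than from an asymptotic $\chi^2$ distribution, bins with very few expected events are handled without approximation.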



3.1. The model-independent search for resonances 

The BumpHunter algorithm [4] is used to look for resonances in the $m_{jj}$ spectrum in a model-independent
way, which offers high sensitivity and correctly takes into account the trials factor (a.k.a. "look elsewhere"
effect), namely the effect of examining various positions of the $m_{jj}$ spectrum before eventually finding something.
The general way to account for the trials factor is to construct a hypertest, and the BumpHunter is just one
such hypertest.

In [4] one can find a detailed account of the trials factor, the definition of hypertests, and details about the
BumpHunter and other similar hypertests, such as the TailHunter. A summary is given here.

A hypertest is a hypothesis test, much like the well-known $\chi^2$ test or the Kolmogorov-Smirnov (KS) test. It has a
test statistic, and a corresponding p-value, which is, as usual, the frequency with which a more signal-like test
statistic than the one observed in data would occur under the background-only hypothesis. The difference is
that, in a hypertest, the test statistic itself is a p-value$^{3}$, therefore the p-value of a hypertest is a p-value of a



$^{3}$ As we will see, it is actually a monotonically decreasing function of a p-value, like $-\log(\text{p-value})$, for reasons of convention.
Figure 2: Left: The function of Eq. (2) used to fit Pythia QCD, after ATLAS detector simulation. Right: The data
(black markers), the fitted background (solid histogram), and three examples of $q^*$ signal (hollow markers). The red
bars indicate the significance of each excess (positive bars) or deficit (negative bars) of data, expressed in standard
deviations. In bins with a small expected number of events, where the observed number of events is similar to the
expected, the Poisson probability of a fluctuation at least as high (low) as the observed excess (deficit) can be greater
than 50%, resulting in a negative significance when expressed in standard deviations. Drawing negative values would
obscure the current intuitive convention that data excesses appear as positive bars and data deficits as negative bars, and
the amplitude of each bar is proportional to the significance of each deviation. Such bins present no statistical interest,
so, for simplicity, bars are not drawn for them.



p-value.$^{4}$ Each hypertest contains in its definition an ensemble of hypothesis tests. These are the hypertest's
"members", and their multitude is responsible for the trials factor. The hypertest combines the results of its
members$^{5}$. The hypertest's test statistic is, by convention$^{6}$, the negative logarithm of the smallest p-value
of all tests in the ensemble. The p-value of the hypertest is the frequency with which, in the background-only
hypothesis, there would be any test in the ensemble with a p-value at least as small as the smallest p-value
found in the ensemble of tests when they were performed on the actual data.

The BumpHunter, as implemented in this analysis, is a hypertest whose ensemble of combined tests is
defined after forming all possible $m_{jj}$ intervals in the binned $m_{jj}$ spectrum, and performing in each interval a
simple event-counting hypothesis test. The bin sizes increase at higher $m_{jj}$, following detector resolution, such
that any new physics signal would populate at least two consecutive bins. For this reason, we do not consider
$m_{jj}$ intervals narrower than two bins. Since we are only interested in excesses of data, the test statistic of the
hypothesis test performed in each $m_{jj}$ interval takes its minimum value, which is 0 by convention, if the data
in the interval ($D$) are fewer than expected ($B$). If $D > B$, then the test statistic increases monotonically with
$D - B$, for example as $(D-B)^2$, or $(D-B)^{100}$, or $-\log\left(\frac{B^D}{D!}e^{-B}\right)$; it doesn't matter exactly how the test
statistic increases$^{7}$, because all the hypertest needs is the p-value of this test statistic, which in any case is the
Poisson probability of observing at least $D$ events when $B$ are expected: $\sum_{n=D}^{\infty} \frac{B^n}{n!} e^{-B}$.

An equivalent, more procedural (and maybe more intuitive) way to describe the BumpHunter hypertest is 
the following algorithm: 



$^{4}$ Contrast that with the $\chi^2$ test or the KS test, where the test statistic is just a metric of discrepancy, like the $\chi^2$ or the biggest
difference between two cumulative distributions.

$^{5}$ A hypertest can even contain hypertests; nothing changes. Combining simple hypothesis tests creates a hypertest. Combining
hypertests creates just another hypertest. A hypertest that contains just one member test is a trivial hypertest, whose p-value is
identical to the p-value of its member.

$^{6}$ The point of this convention is to have a test statistic which increases with increasing discrepancy.

$^{7}$ It doesn't even matter if the test statistic is $(D-B)^2$ in one interval and $(D-B)^{100}$ in another.
1. Scan the $m_{jj}$ spectrum, counting events in all intervals of at least 2 bins, and in each interval compute
the Poisson p-value of seeing at least as many data as observed, given the expected number of events.

2. Keep the smallest Poisson p-value found in the actual data. This number is the observed test statistic of
the BumpHunter.

3. Generate many spectra of background-only pseudo-data, and scan them too, noting the smallest Poisson
p-value in each one. Note that, since the data were compared to the background fitted to them, the fit
must be repeated on each individual pseudo-spectrum before scanning. This wouldn't be necessary if the
background were not obtained by a fit.

4. Measure how frequently a pseudo-spectrum returns a minimum Poisson p-value that is smaller than the
observed test statistic (step 2). This frequency is the p-value of the BumpHunter hypertest.
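The four steps above can be sketched in Python as follows. This is a simplified toy, not the analysis implementation: the background is taken as fixed and known, so the refit of step 3 is deliberately omitted; a pseudo-spectrum counts if its minimum p-value is at least as small as the observed one; and all function names, the toy spectra, and the simple Poisson sampler are assumptions of this example.

```python
import math
import random


def poisson_tail(d, b):
    """P(n >= d) for a Poisson mean b > 0, summed term by term in log space."""
    if d <= 0:
        return 1.0
    total, n = 0.0, d
    while True:
        term = math.exp(n * math.log(b) - b - math.lgamma(n + 1))
        total += term
        if n > b and term <= 1e-16 * total:
            return min(total, 1.0)
        n += 1


def min_pvalue(data, bkg, min_width=2):
    """Steps 1-2: smallest Poisson p-value over all intervals with an excess."""
    best_p, best_interval = 1.0, None
    for lo in range(len(data)):
        for hi in range(lo + min_width, len(data) + 1):
            d, b = sum(data[lo:hi]), sum(bkg[lo:hi])
            if d > b:  # only excesses are of interest
                p = poisson_tail(d, b)
                if p < best_p:
                    best_p, best_interval = p, (lo, hi)
    return best_p, best_interval


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def bumphunter(data, bkg, n_pseudo=200, seed=1):
    """Steps 3-4: global p-value from background-only pseudo-spectra (no refit)."""
    rng = random.Random(seed)
    p_obs, interval = min_pvalue(data, bkg)
    n_as_extreme = 0
    for _ in range(n_pseudo):
        pseudo = [sample_poisson(b, rng) for b in bkg]
        if min_pvalue(pseudo, bkg)[0] <= p_obs:
            n_as_extreme += 1
    return n_as_extreme / n_pseudo, interval, p_obs


toy_bkg = [50.0, 40.0, 30.0, 20.0, 10.0, 5.0]  # invented falling spectrum
toy_data = [50, 40, 30, 40, 25, 5]             # invented bump in bins 3 and 4
print(bumphunter(toy_data, toy_bkg))
```

The returned interval uses half-open bin indices `(lo, hi)`. Scanning the pseudo-spectra with exactly the same procedure as the data is what builds the trials factor into the final p-value.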

If the BumpHunter's p-value is very small$^{8}$, it is a fail-safe statement to say that there is a local excess
in the data, which is incompatible with the background, because the probability of such an excess being pure
coincidence is equally small. It also helps that we know which member test (i.e. which $m_{jj}$ interval) returned
the smallest p-value, because that is where the signal is, so we are pointed at it.$^{9}$

By its very construction, a hypertest accounts for the trials factor, since it considers every member test in 
every pseudo-experiment. 

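It is very important that the hypertest's test statistic is the largest $-\log(\text{p-value})$ in the ensemble, e.g. the
largest

$$-\log\left(\sum_{n=D}^{\infty} \frac{B^n}{n!} e^{-B}\right), \tag{3}$$

and not the largest test statistic in the ensemble, e.g. the largest

$$-\log\left(\frac{B^D}{D!} e^{-B}\right), \quad \text{or} \quad (D-B)^{100}. \tag{4}$$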

It would have been wrong to use directly the test statistic of the most discrepant member test, because test
statistics of different member tests follow different distributions under the background-only hypothesis, and are
not comparable. For example, in the case of the BumpHunter, this mistake could have led to a hypertest
with a strong bias towards low-$m_{jj}$ regions with high background ($B$), because, e.g., $(1010-1000)^{100}$ is greater
than $(15-10)^{100}$, even though it's obvious that observing 1010 events instead of 1000 is less significant than
observing 15 instead of 10. The right way to combine dissimilar hypothesis tests is by comparing their p-values,
not their test statistics. Hypertests, by construction, avoid this pitfall.
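The numerical example in this paragraph can be checked directly. The short Python sketch below compares the two cases using both a raw test statistic of the form $(D-B)^{100}$ and the Poisson p-value; the helper `poisson_tail` is written for this illustration and is not part of any analysis software.

```python
import math


def poisson_tail(d, b):
    """P(n >= d) for a Poisson mean b > 0, summed term by term in log space."""
    if d <= 0:
        return 1.0
    total, n = 0.0, d
    while True:
        term = math.exp(n * math.log(b) - b - math.lgamma(n + 1))
        total += term
        if n > b and term <= 1e-16 * total:
            return min(total, 1.0)
        n += 1


# Raw test statistics: the high-background interval looks "more discrepant".
t_high_bkg = (1010 - 1000) ** 100
t_low_bkg = (15 - 10) ** 100
print(t_high_bkg > t_low_bkg)

# Poisson p-values: the low-background interval is the more significant one.
p_high_bkg = poisson_tail(1010, 1000.0)
p_low_bkg = poisson_tail(15, 10.0)
print(p_high_bkg, p_low_bkg)
```

Ranking intervals by p-value puts the excess of 15 over 10 ahead of the excess of 1010 over 1000, which is the ordering that a comparison of raw test statistics gets wrong.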

Figure 3 demonstrates the sensitivity of the BumpHunter to a test signal injected at 2 TeV, with Gaussian
shape of standard deviation 100 GeV (Fig. 3(a)). Figure 3(b) compares the power of a variety of hypothesis
tests, defined as the probability that a test observes a p-value less than 5% (equivalent to a 1.6$\sigma$ effect)$^{10}$,
as a function of the expected number of injected signal events. The faster the power increases, the higher the
sensitivity of a test to this signal.
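The correspondence between p-values and Gaussian standard deviations used in this text follows the one-sided convention, which can be reproduced with the Python standard library. The function names below are chosen for this sketch only.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def pvalue_to_sigma(p):
    """One-sided significance z such that P(Z >= z) = p for a standard normal Z."""
    return -_STANDARD_NORMAL.inv_cdf(p)


def sigma_to_pvalue(z):
    """One-sided tail probability P(Z >= z)."""
    return _STANDARD_NORMAL.cdf(-z)


for p in (0.05, 1.35e-3, 2.87e-7):
    print(p, round(pvalue_to_sigma(p), 2))
```

A p-value of 5% maps to about 1.6 standard deviations, while the conventional "evidence" and "discovery" thresholds map to 3 and 5 standard deviations respectively.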

In Fig. 3(b), two of the compared tests, the BumpHunter and the TailHunter [4], are hypertests with
no knowledge of the injected signal. The rest are not hypertests. Three of them examine the whole spectrum:
the KS test, the $\chi^2$ test, and its generalization that uses the test statistic $-\log\left(\prod_{\text{bins}} \frac{B^D}{D!} e^{-B}\right)$. Finally, two
hypothesis tests are constructed exploiting knowledge of the signal, which in reality is not possible, unless one
knows in advance what is about to be discovered. One of these two tests is aware only of the signal location,


$^{8}$ By pure convention, a p-value of $1.35 \times 10^{-3}$, corresponding to a 3$\sigma$ effect [12], is considered "evidence", and a p-value of
$2.87 \times 10^{-7}$, corresponding to a 5$\sigma$ effect, is considered "discovery". One may argue that the 5$\sigma$ requirement is unnecessarily
conservative, especially for the p-value of a hypertest where the trials factor is already accounted for.

$^{9}$ However, it should not be thought that the BumpHunter is performing an inference of the actual mass or width of a new
particle. It is just a frequentist test of the background-only hypothesis, which returns a p-value; not any confidence interval or
posterior probability density for the signal parameters. The interval pointed at by the BumpHunter could be used as input to a
formal inference procedure, which may be Bayesian or Frequentist, and which will have to assume some more information about
the signal, e.g. the shape of its distribution in $m_{jj}$.

$^{10}$ This is an arbitrary choice that doesn't affect the conclusions of this comparison.
Figure 3: Left: The toy signal (blue, Gaussian at 2 TeV with $\sigma = 100$ GeV) injected on top of a background similar to
QCD. Right: The power (see text) of the following tests, as a function of injected signal events: the BumpHunter (filled
squares), the TailHunter (empty triangles), the KS test (stars), the $\chi^2$ test (empty squares), its generalization that
uses the negative logarithm of the likelihood of the data under the background-only hypothesis (empty circles), and two
tests that know something about the injected signal: a test that performs event-counting in the window which contains
68% of the signal (filled triangles), and a test that knows not only the position, but also the exact shape of the signal,
and computes a likelihood ratio test statistic (filled circles).



so it performs a counting test just in the bins where 68% of the signal is expected. The other knows the exact
shape and position of the signal, but not how much of it there is, so it uses as test statistic the
$\log\left(\frac{L(\mathrm{data}\,|\,\hat{s}\ \mathrm{signal})}{L(\mathrm{data}\,|\,\mathrm{no\ signal})}\right)$,
where $\hat{s}$ is the best fitting amount of the known signal.

The comparison shows that, among the tests with no prior knowledge of the signal, the BumpHunter is by 
far the most sensitive, followed by the TailHunter. This is true despite the trials factor, which does not apply 
to other tests. The KS test is the least sensitive to this signal. The most sensitive test of all is the one which 
knows the exact shape and position of the signal. The next most sensitive is the one that knows its position. 

3.1.1. Results of the search for resonances



Figure 4(a) shows the most interesting interval identified by the BumpHunter in the actual data. Figure 4(b),
known as "BumpHunter tomography", shows $\sum_{n=D}^{\infty} \frac{B^n}{n!} e^{-B}$ for each $m_{jj}$ interval that contains an excess of
data ($D > B$), which allows us to see that there is no interval that gets anywhere close to being significant, even
without accounting for the trials factor. Figure 4(c) shows the distribution of the BumpHunter test statistic
under the background-only hypothesis, and the blue arrow indicates the observed test statistic value. From that
it is clear that the observed spectrum does not contain any significant bump anywhere. The BumpHunter's
p-value is 62%.

No significant excess has been found in $m_{jj}$ in 0.81 fb$^{-1}$ of data.



3.2. Limits to specific models 

Upper limits are set on the accepted cross-section ($\sigma \times \mathcal{A}$) of excited quarks ($q^*$), axigluons ($A$), and scalar
color octets ($s8$). Details about how these models were simulated can be found in [11]. The limits are Bayesian,
at 95% credibility level, with a constant prior in $\sigma \times \mathcal{A}$. The observed and expected limits are compared to the
corresponding theoretical predictions in Fig. 5. Lower mass limits are summarized in Table II.

The main systematic uncertainties, which have been convolved into the limits, are the jet energy scale (JES)
uncertainty (2 to 4%, depending on $p_T$ and $\eta$), the background fit uncertainty (ranging from less than 1% at
the beginning of the spectrum to about 20% at 4 TeV), and the luminosity uncertainty of 4.5%. The
jet energy resolution (JER) uncertainty has a negligible effect on the limits and is ignored.
Figure 4: Results of the search for resonances. Left: The most significant excess found in the data, indicated by the blue
vertical lines. Middle: The Poisson p-value ($\sum_{n=D}^{\infty} \frac{B^n}{n!} e^{-B}$) of each interval with an excess ($D > B$). Each $m_{jj}$ interval
is indicated by its position on the horizontal axis, and its p-value by the position on the vertical axis. Even the most
significant interval, without any trials factor considered, has a p-value of about 0.09, which is far from significant.
Right: The distribution of the BumpHunter test statistic under the background-only hypothesis, compared to the
observed test statistic (blue arrow). The BumpHunter p-value is 62%.




Figure 5: Limits on excited quarks and axigluons (left), on scalar color octets (middle), and on Gaussian-distributed
signals (right) of mean values between 900 GeV and 4 TeV and standard deviations ($\sigma_{\mathrm{Gaussian}}$) from 5% to 15% of the
mean value.



3.3. Model-independent limits 

In addition to specific theories, limits are also set on a collection of hypothetical signals that are assumed to
be Gaussian-distributed in $m_{jj}$, with means ranging from 0.9 to 4.0 TeV, and standard deviations from 5% to
15% of the mean. These limits include the same luminosity and background fit systematic uncertainties. Since
the Gaussian $m_{jj}$ distributions do not result from actual jets, to convolve the JES uncertainty the mean of the
Gaussian is given a 4% uncertainty, which is conservatively larger than the shift observed in $q^*$ templates when
jet $p_T$ was shifted by the JES uncertainty.



Table II: The 95% CL lower mass limits for the models examined in this study.

Model                   95% CL Limits (TeV)
                        Expected    Observed

Excited Quark           2.77        2.91
Axigluon                3.02        3.21
Colour Octet Scalar     1.71        1.91
The results are shown in Fig. 5(c). These limits can be used to set approximate limits on any new physics
model that predicts some peaking signal in $m_{jj}$, because almost every signal excess can be approximated by a
Gaussian at some level. A procedure to make the approximation is given in [11]. The non-Gaussian tails of
the signal need to be removed, to make the Gaussian shape a better approximation to the remaining signal.
Removing the tails results in lower signal acceptance. Removing the tails does not much affect the mass
limits, because the tails are in background-dominated regions; the theoretical cross-section of the new particle is
reduced by the acceptance of keeping just the core of its $m_{jj}$ distribution, but the $\sigma \times \mathcal{A}$ upper limit is reduced
by roughly the same fraction, so the theory and the limits continue to intersect at the same mass as if the tails
had not been removed.



4. Conclusion 

The latest results were shown from the new physics searches in the dijet angular and mass distributions. The
latter was updated with 0.81 fb$^{-1}$ of 2011 data. No sign of new physics was found. Limits on $q^*$ improved by
about 1 TeV since 2010, and are currently the most stringent ones. Limits on Gaussian signal distributions have
been updated, as a tool to compute approximate limits on more theoretical models.